20 research outputs found

    Information retrieval and text mining technologies for chemistry

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing system performance, in particular the CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation, together with text mining applications for linking chemistry with biological information, are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field. A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain).
This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo García-Yoldi for useful feedback and discussions during the preparation of the manuscript.

    The Beni-Ilmane (Algeria) seismic sequence of May 2010: Seismic sources and stress tensor calculations

    A moderate earthquake with a moment magnitude of Mw 5.5 struck the Sub-Bibanique region of eastern Algeria on 14 May 2010, killing three people, injuring hundreds of others, and causing moderate damage in the epicentral area, mainly in the villages of Beni-Ilmane and Samma. The focal mechanism of the seismic source for the first shock, obtained by near-field waveform modelling, exhibits left-lateral strike-slip faulting with the first nodal plane oriented at N345°, and right-lateral strike-slip faulting with the second nodal plane oriented at N254°. A second earthquake that struck the region on 16 May 2010, with a moment magnitude of Mw 5.1, was located 9 km SW of the first earthquake. The focal mechanism obtained by waveform modelling showed reverse faulting with nodal planes oriented NE–SW (N25° and N250°). A third earthquake that struck the region on 23 May 2010, with a moment magnitude of Mw 5.2, was located 7 km S of the first shock. The obtained focal mechanism showed a left-lateral strike-slip plane oriented at N12° and a right-lateral strike-slip plane oriented at N257°. Field investigations combined with geological and seismotectonic analyses indicate that the three earthquake shocks were generated by activity on three distinct faults. The second and third shocks were generated on faults oriented WSW–ENE and NNE–SSW, respectively. The regional stress tensor calculated in the region gives an orientation of N340° for the maximum compressive stress direction (σ1), which is close to the horizontal, with a stress shape factor indicating either a compressional or a strike-slip regime.

    A Hybrid approach for biomedical relation extraction using finite state automata and random forest-weighted fusion

    Paper presented at: The 18th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2017), held in Budapest, Hungary, from 17 to 23 April 2017. The automatic extraction of relations between medical entities found in related texts is considered to be a very important task, due to the multitude of applications that it can support, from question answering systems to the development of medical ontologies. Many different methodologies have been presented and applied to this task over the years. Of particular interest are hybrid approaches, in which different techniques are combined in order to improve on the individual performance of each. In this study, we extend a previously established hybrid framework for medical relation extraction, which we modify by enhancing the pattern-based part of the framework and by applying a more sophisticated weighting method. Most notably, we replace the use of regular expressions with finite state automata for the pattern-building part, while the fusion part is replaced by a weighting strategy that is based on the operational capabilities of the Random Forests algorithm. The experimental results indicate the superiority of the proposed approach over the aforementioned well-established hybrid methodology and other state-of-the-art approaches. This work was supported by the project KRISTINA (H2020-645012), funded by the European Commission. Deidentified clinical records used in this research were provided by the i2b2 National Center for Biomedical Computing funded by U54LM008748 and were originally prepared for the Shared Tasks for Challenges in NLP for Clinical Data organized by Dr. Ozlem Uzuner, i2b2 and SUNY.

    Task-Oriented Complex Ontology Alignment: Two Alignment Evaluation Sets

    Simple ontology alignments, which have been widely studied, link one entity of a source ontology to one entity of a target ontology. One limitation of these alignments, however, is their lack of expressiveness, which can be overcome by complex alignments. Although different complex matching approaches have emerged in the literature, there is a lack of complex reference alignments on which these approaches can be systematically evaluated. This paper proposes two sets of complex alignments between 10 pairs of ontologies from the well-known OAEI conference simple alignment dataset. The methodology for creating the alignment sets is described and takes into account the use of the alignments for two tasks: ontology merging and query rewriting. The ontology merging alignment set contains 313 correspondences and the query rewriting one 431. We report an evaluation of state-of-the-art complex matchers on the proposed alignment sets.